Collective Elaboration of a Coreference Annotated Corpus for Portuguese Texts

Authors

  • Evandro Brasil da Fonseca
  • Vinicius Sesti
  • Sandra Collovini
  • Renata Vieira
  • Ana Luisa Leal
  • Paulo Quaresma
Abstract

This paper describes the collaborative creation of a corpus with coreference annotation for Portuguese. The annotation was performed using the coreference annotation tool CORP and the editing tool CorrefVisual. The texts were automatically annotated and then manually revised by Portuguese speakers. As a result, a new corpus for coreference studies was produced for Portuguese.

Similar resources

Nominal Coreference Annotation in IberEval2017: The Case of FORMAS Group

This work describes the participation of the FORMAS group from the Federal University of Bahia (UFBA) in the Shared Task on Collective Elaboration of a Coreference Annotated Corpus for Portuguese Texts at IberEval 2017. As such, it describes the creation of a corpus annotated with coreference information for the Portuguese language. We discuss the choices adopted in the annotation process, as wel...

Can Projected Chains in Parallel Corpora Help Coreference Resolution?

The majority of current coreference resolution systems rely on annotated corpora to train classifiers for this task. However, this is possible only for languages for which annotated corpora are available. This paper presents a system that automatically extracts coreference chains from texts in Portuguese without the need for Portuguese corpora manually annotated with coreferential information. ...

The Coreference Annotation of the CSTNews Corpus

In this paper we report the coreference annotation process of the CSTNews corpus as part of a collective task of the IberEval 2017 conference. The annotated corpus is composed of 140 news texts written in Brazilian Portuguese and includes several annotation layers, covering the morphosyntactic/syntactic, semantic, and discourse levels. The annotation, focused on nominal r...

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Corpus for Coreference Resolution on Scientific Papers

The ever-growing number of published scientific papers prompts the need for automatic knowledge extraction to help scientists keep up with the state-of-the-art in their respective fields. To construct a good knowledge extraction system, annotated corpora in the scientific domain are required to train machine learning models. As described in this paper, we have constructed an annotated corpus fo...

Publication date: 2017